A Mutual Subspace Clustering Algorithm for High Dimensional Datasets
Authors
Abstract
Generating consistent clusters is a long-standing research problem in knowledge and data engineering. In real applications, different similarity measures and different clustering techniques may be adopted in different clustering spaces. In such cases, it is very difficult or even impossible to define an appropriate similarity measure and clustering criteria in the union space. Mutual subspace clustering across multiple clustering spaces differs fundamentally from subspace clustering in one (union) clustering space: it finds the common clusters agreed upon by subspace clustering in both clustering spaces, a task that traditional subspace clustering analysis cannot handle. The partitioning model divides the points in a data set into k exclusive clusters and finds a signature subspace for each cluster, where k is the number of clusters desired by the user. This model improves on k-means by eliminating random centroid selection, using average pairwise distance and other parameters to generate consistent clusters. Experimental results on a cancer data set are reported to demonstrate the efficiency of mutual subspace clustering.
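The abstract does not spell out how random centroid selection is replaced or how a signature subspace is chosen, so the following is only a minimal sketch of the partitioning step in a single clustering space, under stated assumptions. The seeding rule (most central point by average pairwise distance, then farthest-first) and the subspace rule (the m lowest-variance dimensions within a cluster) are illustrative choices, not the authors' method, and the names `deterministic_seeds`, `kmeans`, and `signature_subspace` are hypothetical. The mutual step across two clustering spaces is not covered.

```python
import math


def avg_pairwise_distances(points):
    """Average Euclidean distance from each point to every other point."""
    n = len(points)
    return [sum(math.dist(p, q) for q in points) / (n - 1) for p in points]


def deterministic_seeds(points, k):
    """Pick k seeds without randomness (assumed rule, not the paper's)."""
    avg = avg_pairwise_distances(points)
    # First seed: the most central point (smallest average pairwise distance).
    seeds = [min(range(len(points)), key=lambda i: avg[i])]
    # Remaining seeds: the point farthest from its nearest already-chosen seed.
    while len(seeds) < k:
        nxt = max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[s]) for s in seeds),
        )
        seeds.append(nxt)
    return [points[i] for i in seeds]


def kmeans(points, k, iters=100):
    """Plain k-means started from the deterministic seeds above."""
    centroids = deterministic_seeds(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


def signature_subspace(points, labels, cluster, m):
    """Assumed rule: the m dimensions with the lowest within-cluster variance."""
    members = [p for p, l in zip(points, labels) if l == cluster]

    def variance(d):
        vals = [p[d] for p in members]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    dims = range(len(members[0]))
    return sorted(sorted(dims, key=variance)[:m])


# Toy data: two groups separated in dimensions 0 and 1; dimension 2 is noise.
points = [
    (0.0, 0.0, 5.0), (0.1, 0.0, 9.0), (0.0, 0.1, 1.0),
    (10.0, 10.0, 3.0), (10.1, 10.0, 7.0), (10.0, 10.1, 0.0),
]
labels, centroids = kmeans(points, k=2)
subspace = signature_subspace(points, labels, cluster=labels[0], m=2)
```

Because no step draws random numbers, repeated runs on the same data return the same labels, which is the sense of "consistent clusters" this sketch illustrates.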
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
A Robust k-Means Type Algorithm for Soft Subspace Clustering and Its Application to Text Clustering
Soft subspace clustering algorithms are effective clustering techniques for high dimensional datasets. Although several soft subspace clustering algorithms have been developed in recent years, their robustness should be further improved. In this work, a novel soft subspace clustering algorithm, RSSKM, is proposed. It is based on the incorporation of the alternative distance metric into the framework of kme...
DBSC: A Dependency-Based Subspace Clustering Algorithm for High Dimensional Numerical Datasets
We present a novel algorithm called DBSC, which finds subspace clusters in numerical datasets based on the concept of “dependency”. This algorithm uses a depth-first search strategy to find the maximal subspaces: a new dimension is added to the current k-subspace and its validity as a (k+1)-subspace is evaluated. The clusters within those maximal subspaces are mined in a similar fashion as maxi...
Identifying Information-Rich Subspace Trends in High-Dimensional Data
Identifying information-rich subsets in high-dimensional spaces and representing them as order revealing patterns (or trends) is an important and challenging research problem in many science and engineering applications. The information quotient of large-scale high-dimensional datasets is significantly reduced by the curse of dimensionality which makes the traditional clustering and association...
Leveraging Union of Subspace Structure to Improve Constrained Clustering
Many clustering problems in computer vision and other contexts are also classification problems, where each cluster shares a meaningful label. Subspace clustering algorithms in particular are often applied to problems that fit this description, for example with face images or handwritten digits. While it is straightforward to request human input on these datasets, our goal is to reduce this inp...
Journal title:
Volume, issue:
Pages: -
Publication date: 2016